Search timeouts

Although the Texpress search algorithm retrieves data from large data sets very quickly, some classes of query can still take a long time to execute.

In particular, the following queries may execute slowly:

Search type

Explanation

Search without any index support

This class of search includes queries on fields without index support, and wildcard searches in which no index term has been provided. For example, searching for all surnames in the Parties module that start with A (an A* search) results in every record being retrieved and its surname data matched against A*. For a large data set this may take some time.

Range searches

While some index support is provided for range queries, it is provided only at the record descriptor level. A range search therefore involves visiting every record descriptor and checking its range bits to see whether the record falls within the search range.

Large numbers of OR terms combined with AND terms

When an OR query is executed, the server logically performs one query per OR term and merges the results into one matching set. Since each query is very fast, this is not normally a problem. However, if a query contains a large number of OR terms and also some AND terms, the query optimizer rearranges it into a single set of OR terms in which each OR term contains a list of AND terms (that is, it pushes OR above AND). Thus a query like:

select all from eparties where NamFirst contains 'Joe' AND (NamLast contains 'Doe' OR NamLast contains 'Smith')

is transformed by the optimizer into:

select all from eparties where (NamFirst contains 'Joe' AND NamLast contains 'Doe') OR (NamFirst contains 'Joe' AND NamLast contains 'Smith')

If a query contains a large number of OR terms and AND terms, the optimizer may take considerable time to rework the original query into a form that the search engine can use. Once the optimizer has finished, each OR query is very quick (provided it contains index terms).
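
The effect of pushing OR above AND can be illustrated with a short Python sketch. It is not the Texpress optimizer; the function name push_or_above_and and the list-of-lists query representation are assumptions made for illustration only. It shows the rewrite of the example query above, and how quickly the number of OR branches grows when several ANDed groups each contain many OR terms.

```python
from itertools import product


def push_or_above_and(and_groups):
    """Distribute AND over OR (hypothetical helper, not a Texpress API).

    and_groups is a list of groups that are ANDed together; each group is a
    list of simple terms that are ORed. The result is a list of branches to
    be ORed, where each branch is a list of terms to be ANDed.
    """
    return [list(combo) for combo in product(*and_groups)]


# NamFirst contains 'Joe' AND (NamLast contains 'Doe' OR NamLast contains 'Smith')
query = [
    ["NamFirst contains 'Joe'"],
    ["NamLast contains 'Doe'", "NamLast contains 'Smith'"],
]
branches = push_or_above_and(query)
for branch in branches:
    print("(" + " AND ".join(branch) + ")")

# Three ANDed groups of 20 OR terms each expand to 20 * 20 * 20 branches.
big = [[f"Field{g} contains 'term{i}'" for i in range(20)] for g in range(3)]
big_branch_count = len(push_or_above_and(big))
print(big_branch_count)
```

The branch count is the product of the sizes of the ANDed groups, which is why the rewriting step, rather than the searches themselves, can dominate the time taken by such a query.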

Since some of these queries may take some time on large data sets, a facility was introduced in Texpress 8.0.028 that allows searches to be timed out: a search can be stopped after a given number of seconds. Support was also added for aborting a search that would result in a large number of segments being retrieved (this handles range queries and searches without index support).
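
The two limits described above can be pictured with a minimal Python sketch of an unindexed wildcard scan. It makes no claim about how Texpress implements the facility: the names scan, SearchAborted, timeout_secs and max_segments are hypothetical, and the sketch treats each record as one unit of retrieval, which simplifies what Texpress calls a segment.

```python
import fnmatch
import time


class SearchAborted(Exception):
    """Raised when a scan exceeds its time or retrieval budget (hypothetical)."""


def scan(records, pattern, timeout_secs=None, max_segments=None,
         clock=time.monotonic):
    """Match every record against a wildcard pattern, with optional limits.

    Without index support every record must be visited, so the limits are
    checked once per record before the pattern match is attempted.
    """
    deadline = None if timeout_secs is None else clock() + timeout_secs
    matches = []
    for visited, value in enumerate(records, start=1):
        if max_segments is not None and visited > max_segments:
            raise SearchAborted(f"more than {max_segments} retrievals needed")
        if deadline is not None and clock() > deadline:
            raise SearchAborted(f"search exceeded {timeout_secs} seconds")
        if fnmatch.fnmatchcase(value, pattern):
            matches.append(value)
    return matches


surnames = ["Adams", "Baker", "Allen", "Clark"]
print(scan(surnames, "A*"))  # unlimited scan visits all four records

try:
    scan(surnames, "A*", max_segments=2)
except SearchAborted as err:
    print("aborted:", err)
```

The clock parameter exists only so that the time limit can be exercised with a fake clock; in normal use the default monotonic clock is sufficient.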

The query timeout mechanism is supported in texforms, TexQL and texserver. Extensions have been added to texxmlserver to take advantage of the new facility.

Texpress options to control the timing out of searches include:

Timeout extensions are also available for texxmlserver: